Back to Basics: CLASSY 2006

Authors

  • John M. Conroy
  • Judith D. Schlesinger
  • Dianne P. O’Leary
  • Jade Goldstein
Abstract

The IDA/CCS summarization system, CLASSY, underwent significant change for this year’s DUC. Two changes made processing simpler and faster: 1) we eliminated the part-of-speech (POS) tagger previously used for sentence splitting and to assist sentence trimming, and 2) we simplified the scoring of sentences for inclusion in the summary by introducing a new “approximate oracle” score. A third change introduced a modest amount of extra computation: we ordered the sentences in the summary using a new Traveling Salesperson (TSP) formulation. Together, these changes improved ROUGE scores on last year’s DUC 2005 data and gave strong performance in the DUC 2006 competition.
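The abstract does not spell out the TSP formulation, so the following is only an illustrative sketch of the general idea of treating sentence ordering as a shortest-path (or, equivalently, highest-similarity-path) problem. It is not the CLASSY implementation. The similarity measure (Jaccard word overlap), the function names `jaccard` and `order_sentences`, and the brute-force search are all assumptions made for the example.

```python
from itertools import permutations


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two sentences (an assumed measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def order_sentences(sentences):
    """Return the ordering that maximizes similarity between neighbors.

    This is the open-path variant of the TSP: visit every sentence once
    so that the summed similarity of adjacent pairs is as large as possible.
    Exhaustive search is O(n!), so it only suits a handful of sentences.
    """
    n = len(sentences)
    if n < 2:
        return list(sentences)

    # Pairwise similarity matrix between all selected sentences.
    sim = [[jaccard(sentences[i], sentences[j]) for j in range(n)]
           for i in range(n)]

    def path_score(order):
        # Sum of similarities between consecutive sentences in this order.
        return sum(sim[order[k]][order[k + 1]] for k in range(n - 1))

    best = max(permutations(range(n)), key=path_score)
    return [sentences[i] for i in best]


if __name__ == "__main__":
    picked = ["a b c", "x y z", "c d x"]
    print(order_sentences(picked))
```

In this toy input, the third sentence shares a word with each of the other two, so it ends up in the middle. For summaries with more than about ten sentences, a heuristic such as nearest-neighbor construction would likely be needed in place of the exhaustive search.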


Related resources

CLASSY Arabic and English Multi-Document Summarization

Our Multilingual Summarization Evaluation entries for MSE-2006 were based upon an improved version of our CLASSY (Clustering, Linguistics, And Statistics for Summarization Yield) system. Our two entries were systems 20 and 21 and represented approaches based upon extracts from a) only English documents and b) English and the translated Arabic documents (full clusters). This paper presents a bri...

Getting Back to Basics

Advances in understanding basic developmental and physiological processes often have direct relevance to human disease. They provide insights into pathogenic mechanisms and reveal new pathways that can be exploited in diagnosis and the development of therapeutics.

CLASSY 2009: Summarization and Metrics

This year the CLASSY team participated in the update summary task and made four submissions to summarization evaluation (AESOP). Our AESOP submissions used combinations of ROUGE scores along with an update (or newness) score. We also use these new metrics, which we call Nouveau ROUGE, to help train our system and evaluate new ideas on computing update summaries. CLASSY (Clustering, Linguistics,...

Arabic/English Multi-document Summarization with CLASSY - The Past and the Future

Automatic document summarization has become increasingly important due to the quantity of written material generated worldwide. Generating good quality summaries enables users to cope with larger amounts of information. English-document summarization is a difficult task. Yet it is not sufficient. Environmental, economic, and other global issues make it imperative for English speakers to underst...


Publication date: 2006